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The dependence of subgrid-scale stresses on variables of the resolved field is stud- 
ied using direct numerical simulations of isotropic turbulence, homogeneous shear 
flow, and channel flow. The projection pursuit algorithm, a promising new regres- 
sion tool for high-dimensional data, is used to systematically search through a large 
collection of resolved variables, such as components of the strain rate, vorticity, 
velocity gradients at neighboring grid points, etc. For the case of isotropic tur- 
bulence, the search algorithm recovers the linear dependence on the rate of strain 
(which is necessary to transfer energy to subgrid scales) but is unable to determine 
any other more complex relationship. For shear flows, however, new systematic 
relations beyond eddy viscosity are found. For the homogeneous shear flow, the 
results suggest that products of the mean rotation rate tensor with both the fluc- 
tuating strain rate and fluctuating rotation rate tensors are important quantities 
in parameterizing the subgrid-scale stresses. A model incorporating these terms 
is proposed. When evaluated with direct numerical simulation data, this model 
significantly increases the correlation between the modeled and exact stresses, as 
compared with the Smagorinsky model. In the case of channel flow, the stresses are 
found to correlate with products of the fluctuating strain and rotation rate tensors. 
The mean rates of rotation or strain do not appear to be important in this case, 
and the model determined for homogeneous shear flow does not perform well when 
tested with channel flow data. Many questions remain about the physical mecha- 
nisms underlying these findings, about possible Reynolds number dependence, and, 
given the low level of correlations, about their impact on modeling. Neverthe- 
less, demonstration of the existence of causal relations between sgs stresses and 
large-scale characteristics of turbulent shear flows, in addition to those necessary 
for energy transfer, provides important insight into the relation between scales in 
turbulent flows. 


1. Introduction 

Of central importance to the numerical simulation of the large scales in turbu- 
lent flows is the proper parameterization of the subgrid-scale (sgs) stress deviator, 
defined as 


Tij = U t Uj - UiUj - -(ttfcUjfc - UkUk) 6 iji 


( 1 ) 
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as a function of the resolved velocity field Hi. Here () represents spatial filtering at 
a particular scale r. The most widely used model is Smagorinsky ’s (1963): 


where 


= -2 C] 


25j j Sij , 



dui_ duy 
v dxj dxi ’ 


( 2 ) 

( 3 ) 


is the strain rate of the resolved motion. The model constant C s can be prescribed 
or can be determined dynamically based on information provided by the resolved 
field, as in the recently developed dynamic model (Germano et a/., 1991). 

Although the Smagorinsky model has been in use for nearly thirty years, for 
roughly half that period it has been known that the model provides only a very 
crude estimate for the stresses. This fact was first demonstrated by Clark et al. 
(1979), where direct numerical simulation (DNS) data for homogeneous isotropic 
turbulence was used to evaluate model predictions. Clark et al. found a correlation 
coefficient of approximately 0.2 when comparing predictions of the Smagorinsky 
model with the exact stresses. McMillan et al (1979) found that the correlation 
coefficient was even lower in homogeneous shear flow, being of order 0.1. Later, 
Piomelli et al (1988) found similar results in turbulent channel flow. 

When contemplating these extremely low correlation coefficients, it may seem 
striking that the Smagorinsky model works at all. Of course, the resolution of this 
paradox is that, by construction, the Smagorinsky model insures that there will 
be a net drain of energy from the large scales to the subgrid-scale motions. This 
is the primary objective of a subgrid-scale model, and as long as this requirement 
is met, reasonable results may be expected. On the other hand, the Smagorinsky 
model provides poor predictions of the individual elements of the stress tensor. It is 
natural to expect that superior results could be obtained with a model that yields a 
more accurate prediction of the stress tensor. The objective of this work is to seek 
out potentially more accurate models. 

The Smagorinsky model relates the subgrid-scale stress with only the resolved 
strain rate. It is reasonable to expect that the stresses might also depend on other 
resolved quantities such as the vorticity. If simple models based on a limited number 
of such quantities are postulated, conventional least-squares fitting techniques can 
be used to test the modeling hypothesis. Such a test was performed by Lund and 
Novikov (1992), where the stresses were assumed to depend on the anti-symmetric 
as well as the symmetric part of the velocity gradient tensor (rotation rate and 
strain rate tensors, respectively). It was shown that the stress tensor could be 
expanded in a series formed from products of these two tensors. Tests of this 
expansion in isotropic turbulence revealed that inclusion of rotation rate did not 
significantly improve the model prediction. The results of Lund and Novikov thus 
suggest that it is necessary to search for other quantities on which the stresses could 
depend. Velocity gradients taken at neighboring points or perhaps gradients filtered 
at different (larger) scales are possible candidates which would not violate Galilean 


invariance. 
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Unfortunately, as the list of possible independent variables increases, the task of 
finding statistically meaningful relations from the DNS data becomes unmanage- 
able. In principal, if a multidimensional scatter-plot of r,j as a function of several 
independent variables is generated, a high- dimensional cloud of points would be 
obtained. This may (or may not) exhibit some clustering around a most probable 
behavior. If such a hypersurface exists about which the data appears preferentially 
clustered, it would constitute a clear basis for modeling. However, finding such a 
surface from the DNS data represents a difficult problem of regression in a high- 
dimensional space of variables. Parametric regression, such as least-square error 
fitting to some assumed functional form, is quite difficult because there is little 
indication as to what such a function should be. Finding the surface by dividing 
the high-dimensional space into small hypercubes and performing local smoothing 
of the data is impractical because even large amounts of data become extremely 
sparse in a high- dimensional setting (curse of dimensionality). 

Although the challenges in performing a high-dimensional regression are apparent, 
recent advances in statistical science allow such problems to be tackled. An elegant 
method that circumvents many problems inherent to high-dimensional regression 
was proposed by Friedman & Stuetzle in 1981. Known as the Projection Pursuit 
Regression algorithm, this method was originally developed to analyze experimental 
data in particle physics involving a large number of variables. The algorithm consists 
of a numerical optimization routine that finds one dimensional projections of the 
original independent variables for which the best correlations with the dependent 
variable can be obtained. The dependent variable can then be written as a sum 
of empirically determined functions of the projections. We shall use the projection 
pursuit regression algorithm to investigate relationships between the subgrid-scale 
stresses and quantities in the resolved field. 

In section 2, we briefly summarize the projection pursuit method, present an 
illustrative example, and comment on both its strengths and weaknesses. Section 3 
describes applications to isotropic turbulence, both decaying and forced. Section 4 
presents applications to homogeneous turbulent shear flow and section 5 to channel 
flow simulations. The results obtained from these anisotropic flows suggest possible 
modeling strategies that are explored at the end of sections 4 and 5. Section 6 
summarizes this work and presents the conclusions. 

2. Review of projection pursuit regression 

The problem is to find the ‘best’ relation between a ‘response’ y and a set of 
predictor variables X\, X 2 ,.* x n . In our problem, y will be identified with each of 
the elements of the sgs stress tensor, and the x,’s will be the elements of resolved 
rate of strain, vorticity, etc., i.e. all the variables that the stresses are assumed to 
depend upon. When performing tests with DNS data, there will be a large number 
of realizations (essentially at every grid-point) of the ‘response variable’ r (y) and 
of the ‘predictor variables’ strain rate, vorticity, etc (x,, i = l,2...n). 

Friedman Sz Stuetzle (1981) summarize the inherent problems of traditional meth- 
ods, such as parametric regression and regression based on local smoothing. With 
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the former, one has to assume a particular functional form and determine unknown 
coefficients or parameters by some method such as least-square error fitting. Since 
we do not wish to impose such relationships a priori , this is not a method of choice. 
Local smoothing consists of fitting a hypersurface in a small hypercube of data and 
repeating this in each cube. The regression surface is then the union of all these 
local fits. In high-dimensional settings, this is practically impossible. Consider the 
following example (Friedman & Stuetzle, 1981). Let x G R 10 , i.e. n = 10. If the 
width of the cube used for the local smoothing spans 10% of the range of each 
variable, each cube will contain typically only a fraction equal to 0.1 10 of the data, 
which is too sparse. On the other hand, if one requires each hypercube to contain 
10 % of the data, then the window has to span 0.1° 1 ~ 80% of the range of the 
predictor variables, which is too large. 

Projection pursuit regression (ppreg henceforth) circumvents these difficulties by 
projecting the high- dimensional data onto a single variable z = a \X\ 4- & 2 X 2 + 
... + o n r n . Local smoothing is then performed to obtain an empirically determined 
function f(z) that follows the main trend of the data as a function of z. The 
smoothing algorithm is described in Friedman & Stuetzle (1981) and consists of 
several passes over the data (y as a function of z) to adjust the bandwidth of 
the smoothing to the local conditions. The variance =< (y — f(z)) 2 > — < 
( y — f(z)) > 2 of the data around f(z) is computed. The core of the algorithm 
is a numerical optimization procedure in which the coefficients a* are selected so 
as to minimize the variance <7 2 . Let the a’s thus found be denoted by and 

let z and be the corresponding univariate projection and the empirical 

function giving a good fit for y as a function of z^\ The procedure is repeated 
for the residues, defined as y — and a new projection a[ 2 ^ and a smooth 

empirical function f^ 2 \z ^ ) are found. This procedure is repeated until the variance 
stops to decrease appreciably by adding new projections. Finally, the model consists 
of the sum 


jVf 

y,nod = £ + a[ m) x 2 + .. + a™x n ). (4) 

m — l 

For the case that the response variable is a linear combination of x t (i.e. y = 
f}\X\ + .../? n #n)i ppreg reduces to the usual n-dimensional linear least-square error 
fit (where the a’s are the coefficients and is a linear function). In general 
however, the functions need not be linear. The fundamental advantage of 

this procedure is illustrated in the following example. If y is the product of x’s, 
say y = X 1 X 2 , then this can be represented as a sum of two univariate functions 
according to y = ^(2^) 2 — \{z^) 2 , where z ^ = x\ + x 2 and = x\ — X 2 . The 
ppreg algorithm is thus able to find some nonlinear relations without stipulating 
them a priori. 

As an illustrative example, we consider 1000 realizations of a ten-dimensional 
random vector x where each Xj is normally distributed with zero mean and unit 
variance. Then y is prescribed as follows: 
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y = X3X4 + tanh(x6 + X7) + £, ( 5 ) 

where £ is another Gaussian random variable with zero mean and variance of 0.1. 
However, f is not included in the list of predictor variables and, therefore, repre- 
sents extraneous noise. Projection pursuit is applied to this artificially generated set 
of data. The projections found by ppreg are, successively: = 0.69, = 0.72; 

a< 2) = 0.66, a^ 2) = 0.74; and a^ 3) = -0.67, a$ 3) = -0.73; other a’s are negligible. 
The empirical functions (solid lines) resemble the tanh function in the first projec- 
tion, parabolas in the latter two. The projected data are shown in Figures 1(a) 
to (c). If we least-square error fit a tanh profile through Figure 1(a), we obtain 
y x = 1. ltanh[l. 3(0.69x 6 + 0.72x 7 )]. The scatter plot in Figure 1(b) is then y - y x vs 
the second projection z w = 0.66x 3 + 0.74x 4 . Parabolic fits through Figures 1(b) 
and 1(c) give ?/2 = 0.4(0. 66x3 + 0.74x 4 ) 2 and y 3 = — 0.4(— 0.67x 3 + 0.73x 4 ) 2 . (These 
fits are not exactly equal to the empirical smoothing functions constructed by the 
algorithm, this being the reason why the scatter plot of Figure 1(c) falls below the 
smooth.) The final model then consists of y x + t/2 + J/3 which is plotted with the 
original y in Figure 1(d). 

The residual noise is mainly due to the non-deterministic dependence of y with 
respect to £. The initial correlation coefficient between y and e.g. x 4 was 0.012, 
while the correlation coefficient between y and the model, y mo d, is now p = 0.96. 
Finding such a non- trivial dependence from few data points in a 10- dimensional 
space is quite remarkable. 

Although impressive in the above example, ppreg is not fool-proof. For cases 
when y depends on the x,’s in ways that cannot be written as sums of functions 
of linear combinations of Xj’s (such as divisions), ppreg is usually unable to find 
good projections. Therefore, while the method works remarkably well for an entire 
family of non trivial relations, it cannot be considered entirely general. 

In addition to application to sgs modeling to be reported in the following pages, 
we believe that the ppreg method should be applicable to a host of other prob- 
lems where large amounts of data need to be analyzed and functional dependencies 
established (Reynolds-stress modeling, reacting flows, control, etc.). 

3. Isotropic turbulence 

In this section, ppreg is used to search for possible functional dependence between 
the residual stresses and a host of resolved variables in homogeneous isotropic tur- 
bulence. Both decaying and forced isotropic turbulent fields are considered. 

3.1 Flow-fields and calculations 

Both the forced and decaying isotropic turbulent fields were generated on a 128 3 
mesh with the pseudo-spectral code of Rogallo (1981). For the decaying turbulence, 
the energy spectrum was initialized according to 

m = k (I) e * p H) • 
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FIGURE 1. Illustrative application of ppreg to a test case, (a) Response y as 
a function of first projection z ^ = 0.69x6 + 0.72x7 (symbols) and mean trend 
found by local smoothing (solid line), (b) Second projection of y — y \ , 
where y\ has been found by fitting a tank profile through Figure 1(a). Solid line; 
fW( Z W) found by the algorithm, (c) Third projection and f^\z^). (d) Response 
variable y as a function of the sum of empirically determined fits in (a), (b) and (c). 


This spectrum has its energy peak at wavenumber 8. The initial phases for the com- 
plex velocity field were chosen randomly but in such a way that the divergence-free 
condition was satisfied (see Rogallo, 1981 for more details on the initial conditions). 

In order to develop realistic turbulence from the random phase initial condition, 
the flow was allowed to evolve freely for 2.9 small scale eddy turnover times, r to 
(based on quantities derived from the end of the initial run; r <0 = ^ where Ao and 
u'o are the Taylor microscale and the rms turbulence intensity, respectively). Over 
this period of time, the total turbulent kinetic energy decayed by 34%. The Taylor 
microscale Reynolds number (u f \/v) was 45.3, and the velocity derivative skewness 
was —0.32. The 3-D radial energy spectrum at the end of the run is plotted in 
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Wavenumber, kr} 

FIGURE 2. 3-D radial energy spectrum for the decaying isotropic turbulence, 

plotted in Kolmogorov units. The experimental data were taken from Comte- Bellot 
and Corrsin (1971). The vertical line indicates the scale at which the velocity field 
was filtered to obtain the synthetic large eddy field. 

Kolmogorov units in Figure 2. Also shown are the experimental data of Comte- 
Bellot and Corrsin (1971) at somewhat higher Reynolds number. Agreement with 
the experimental data between 0.06 < kr] < 0.4 indicates that realistic turbulence 
has been achieved. The tail-up in the simulated spectrum at high wavenumbers 
indicates some lack of resolution. It is generally believed (Rogallo, 1992) that this 
will not adversely affect the data in the central portion of the spectrum used here. 
The vertical line in Figure 2 indicates the scale at which the DNS data was filtered 
in order to generate the synthetic large eddy field. This scale corresponds to four 
grid spacings. 

For the forced simulation, energy was added to the large scales by including an 
anti-diffusion term (negative diffusion coefficient) in the Navier-stokes equations. 
The diffusion coefficient was wavenumber dependent and non-zero only for modes 
within wavenumber shells less than 3. The value of the coefficient for low wavenum- 
bers was chosen such that the maximum wavenumber, scaled in Kolmogorov units, 
was unity (i.e. k maz h = 1). To generate realistic statistically stationary turbu- 
lence, the flow was evolved from the random phase initial conditions for approxi- 
mately 2 large scale eddy turn-over times. The Reynolds number, R \ , settled at 
95.8, while the velocity derivative skewness settled at —0.486. The energy spec- 
trum is shown in Figure 3, where again the vertical line indicates the scale used to 
generate the large eddy field. 

The sgs stresses Tjj and resolved rates of strain Sij and vorticity u;* were computed 
using a spectral cut-off filter with scale r corresponding to 4 grid points. The data 
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Wavenumber, JcL/( 2tt) 

FIGURE 3. 3-D radial energy spectrum for the forced isotropic turbulence. The 

vertical line indicates the scale at which the velocity field was filtered to obtain the 
synthetic large eddy field, 

was sampled on every 8th grid point, producing a total of 16 3 realizations. Also 
computed at each 16 3 point were resolved variables at a scale twice as large as r, 
Sij and The invariants of the tensors were computed as follows: 



( 6 ) 

J//g — (SmnSnpSpm)^ y 

( 7 ) 

lift = 

( 8 ) 


and a similar list of invariants for the larger scale rates-of- strain and vorticity. 

The search procedure consisted of considering separately each element of the 
tensor r as the response variable. Each element of r, in turn, was assumed to 
depend on all 24 of the predictor variables mentioned above (each element of the 
tensors plus all invariants). The search is thus a high dimensional one indeed. 

It is important to note that when performing independent searches for each ele- 
ment of r, the resulting model expressions are not expected to be tensorially correct. 
This weakness stems from the fact that the projection pursuit regression operates 
most effectively on scalar data. The findings of projection pursuit are still quite 
valuable, however, since they may be used to guide the construction of tensorially 
correct models. Such a procedure will be followed here. 
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3.2 Results 

We begin with the decaying field and consider first the normal stress element 
Tn. To limit the scope of the search, we initially restrict the predictor variables to 
quantities filtered at scale r. Furthermore, since 5,, = 0, we eliminate S33 from the 
search, reducing the list to 11 variables. 

The main result is the following. The ppreg algorithm finds only one projection 
(in which the variance of the data around a mean trend is reduced), namely that 
corresponding to Sn. The coefficient a corresponding to Sn is close to unity, while 
all others are less than 0.1. The same is true for all other tensor elements, i.e. the 
only causal dependence appears to be between corresponding elements of r x j and 
Sij. The smoothed dependence is approximately linear, but the variance about it is 
still very large. The correlation coefficient between each element of the stress and 
rate-of-strain tensors is, averaging over all 6 elements, about p = 0.26. Notice that 
the Smagorinsky model requires the product between each rate-of-strain element 
and the second invariant II§. Given the discussion in section 2, this could have 
been detected by the present approach by yielding pairs of projections with similar 
|a|’s for both II $ and S tJ and canceling parabolic dependences. However, such 
projections were found to produce more variance than the ones corresponding to 
constant eddy viscosity. This was checked a posteriori by computing the correlation 
coefficient between each element r x j and the corresponding term — II§Sij. The 
correlation was marginally smaller than for Sij alone, about p = 0.25 on average. 

The same procedure was repeated for the forced isotropic flow, and the same 
observations were made. The correlation between r,j and Sij was even lower (about 
0.12 instead of 0.26), but this was again the only causal dependence captured by 
the algorithm. All other projections did not reduce the variance in any fashion, and 
correlations with —Il^Sij were again smaller than with alone. 

Inclusion of the velocity gradients filtered at a larger scale yielded projections 
that include a weak linear dependence on these gradients but again in terms of the 
same tensor elements only. In other words, for T\\ the ‘best’ (and only) projection 
is onto Sn + 0.2Sn. Nevertheless, this leaves the correlation virtually unchanged 
since Sn and Sn are themselves correlated. Similar results were obtained for other 
tensor elements. 

We also considered the possibility that the sgs stresses depend not only on the 
resolved velocity gradients at the point in question, but at the 26 closest neighboring 
grid-points as well. To do this, the 6 elements of Sij at each point Sij(x + i x r, y + 
z y r, z + z*r); z*, i y , i z = —1,0, 1, as well as 3 vorticity components at each of these 
points was considered. The dimensionality of the space of these predictor variables 
is 243. It appears unrealistic to expect ppreg to perform adequately in such extreme 
circumstances. In order to at least explore this direction, we considered r n (:r,y, z) 
and investigated how it depends on the first element of the rate-of-strain tensor at all 
27 neighboring points on the coarse grid, i.e. the predictor variables were S\\(x + 
i x r,y + i y r,z + z z r), = —1,0,1. The projection pursuit projected again 

most strongly on Sn(x,y, z) ( a = 0.8), while the a’s corresponding to neighboring 
points were below 0.25. Inclusion of these weak dependencies left the correlation 
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coefficient virtually unchanged. Since this test is incomplete (one should include all 
243 elements in the test) the conclusion that the neighboring velocity gradient does 
not affect the sgs stresses is somewhat premature. Nevertheless, the partial results 
obtained here give no indication of any substantial influence. 

3.3 The model of Bardina et al. 

The only model which has been reported to yield high correlations when tested 
with DNS data is the model of Bardina et al. (1983). The correlation between 
Tij and Bij = uitlj — UiUj can be as high as 0.7 to 0.8 when the filter used in 
creating the synthetic large eddy field from the DNS data is Gaussian. In spite of 
this, experience shows that when the model is implemented in actual simulations, 
it dissipates almost no energy, and a Smagorinsky term has to be added (giving 
the mixed model, Bardina et al 1983). This is puzzling since a high correlation 
implies at least some alignment between the modeled stress and rate of strain tensor 
required for dissipation. This issue is addressed below. 

Using a Gaussian filter on the decaying isotropic data, we reproduced the quoted 
correlation of 0.8. We found this result to be misleading, however, since the Gaus- 
sian filter produces a ‘large-scale’ field that contains considerable contributions from 
the ‘small scales’, as viewed from a spectral analysis. This ‘small scale’ information 
is, of course, not available in an actual large eddy simulation if a spectral method is 
used. The model of Bardina et al. can be viewed as a procedure for extracting the 
‘small scale’ component of the synthetic large eddy velocity field generated from the 
DNS data. While this procedure yields impressive correlations in tests with DNS 
data, lack of the ‘small scale’ component in an actual large eddy simulation field 
results in a model that may yield a very poor estimate for the real stresses. The 
near lack of dissipation is probably symptomatic of this. 

This hypothesis was tested by experimenting with different filters. We feel that 
the cut-off filter is the most appropriate for generating the synthetic large eddy 
field since it completely eliminates the ‘small scale’ information that will never be 
present in a spectral large eddy simulation. We have repeated the tests of the model 
of Bardina et al using a cut-off filter to determine Ui . The second filtering, tT*, was 
chosen either to be Gaussian or a second cutoff at a scale twice as large as r. Using 
this scheme, the model of Bardina et al is written as 

B*j = u,u } - iii tij, (9) 

As expected, the correlation between the sgs stress and the Bardina model dropped 
to nearly zero when the cut-off filter was used to generate Uj. This was true inde- 
pendent of the second filter type (iT,). As a consistency check, we found that when 
B*j was included in the projection pursuit as predictor variable, no dependence on 
this tensor was found. 

4. Homogeneous sheared turbulence 

In this section, we search for correlations between sgs stresses and resolved vari- 
ables in homogeneous shear flow. The data was generated by Rogers (1987) on a 
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128 3 mesh using a variant of the Rogallo code. We considered three different re- 
alizations, corresponding to times 10, 12, and 14, in units of the inverse imposed 
mean shear, 5 =< du\jdx 2 >. The mean velocity is in the x\ direction and the 
mean rotation in the x 3 direction. Cutoff filtering was performed on a scale r = 4 
grid-points, and every eighth grid point was sampled, as in section 3. The list of 
predictor variables was again 5n, S 22 , 5i2, *523, 5 i 3 , & i, & 2 , ^ 3 , Hs' anc ^ 

// ^ = |u;|. Ppreg was repeated 6 times for each element of the sgs stress tensor. 

4.1 Results 

In contrast to the tests performed in isotropic turbulence, ppreg was able to 
find several interesting projections in the case of homogeneous shear flow. Table 
1 shows the individual tensor elements and the linear combination of predictor 
variables z = a, ij that dominate the projections (chosen as those whose a > 
0.15). The functional dependence on each projection (f vs z ) was found to be fairly 
linear. The correlation coefficients between T{j and —IlgSij (Smagorinsky model) 
are contrasted in the same table with those between and the dominant elements 
of the linear combinations found. On average, there is about a 100% improvement 
above the Smagorinsky model. 


Stress 

z, (best projection) 

^[r i> ,-//5 IJ ] 

P[r„-A 

T\\ 

-0.39^1 1 -f 0 . 4 lS 2 2+0.73Si2*f0.28S ia +0. 17^2+0,15^3 

0.23 

0.36 

7"22 

-0.2l5n-0.28S 2 2-0.895i2-0.175i3 

0.14 

0.23 

?33 


0.07 


T\7 


0.13 


^23 


0.06 

0.27 

Tl3 

0.155, 2-0. 2^23-0. 565, 3 +O. 68 W 1 +0.29^2-0. 17i>3 

0.21 

0.34 


TABLE 1. Results of projection pursuit for homogeneous shear flow. 

It can be appreciated that causal relations exist that are significantly different 
from the Smagorinsky model. The coefficients showed only minor variations for the 
other two times considered (St=10 and St=14). This robustness suggests that there 
is a physical mechanism by which the large-scale field consistently influences the sgs 
stresses, in addition to what is required energy transfer (i.e. alignment between 
and Stj). Since the relations tabulated above cannot by themselves provide an 
adequate relation between tensors, it could be that dependence on other quantities 
has been omitted. The next section explores the dependence on other quantities 
that may provide possible mechanisms for the observed degree of causality. 
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4.2 Dependence on mean shear and modeling 

An important consideration when developing a model for the sgs stress is that the 
resulting model be in the form of a frame invariant tensor. Clearly, the individual 
terms found in the previous subsection are not invariant under rotations of the 
coordinate system. A tensorial relation must be found that is consistent with the 
findings of ppreg on each tensor element. We attempt to find such a tensorial 
relation in this section. To do this, we first observe that S \ 2 and uj 3 are important 
contributors in the model for t\\. These tensor elements of Sij and Rij = — \£ijk&k 
also correspond to the only non-zero elements in the mean strain-rate and mean 
rotation tensors. This fact suggests that tensor products of the mean strain rate 
and mean rotation with the fluctuating strain rate and fluctuating rotation would 
reproduce some of the dependence found by ppreg for rn. Analogous reasoning 
holds for most of the other elements of r. 

To proceed further, we define the mean strain and rotation rate tensors as 

/o f 0\ 

Ey = f 0 0 (10) 

\0 0 0 / 

/ 0 f 0\ 

fly = -f 0 0 ( 11 ) 

\ 0 0 0 / 

and postulate a model of the following form: 

= — 2c\r 2 IlgSij + 

C2r 2 (Sik^kj + 'EikSkjY + c 3 r 2 (Rikflkj + QikRkj)* + ( 12 ) 

C4 r 2 (Sik£lkj ~ ftikSkj) + csr 2 (Rik'Ekj — 'ZikRkj), 

where ()* indicates trace- free part (note that some of the terms are naturally trace- 
free). The first (Smagorinsky) term is also present in the ppreg results and thus is 
included here. To see more clearly how this model reproduces some of the trends 
of Table 1, note, for instance, that the [11] element of the product between S t j and 
Sjj is linear in Su and that between Rij and Qjj will be linear in u? 3 . Again, similar 
correspondences can be found for other elements of r. 

Since Eq. (12) is linear in the coefficients c,, these can be determined by the usual 
least-squares technique. This procedure is easily derived as follows. Write Eq. (12) 
symbolically as 

7"»j> == C)t(m j(.)jj , (13) 

where (m*)ij is the kth trace-free model tensor in Eq (12). When the DNS data is 
used, the above expression can be compared with the exact trace-free part of the 
subgrid-scale stress, (r* x )^. The error in representing the stress via Eq. (13) is 
given by 
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e >j = ( T ex)ij - Ck(m k )ij. (14) 

Assuming the c* to be constant in space, the global square error is minimized with 
respect to the c k by enforcing the following condition 


w k < e ° Ci> >_ °’ 


(15) 


where <> indicates an average over space. This operation leads to the following 
matrix equation for the c* 


Note that this procedure is rather general and does not require that all five terms 
in Eq (12) be included. Any subset of the five terms can used as a basis and 
corresponding coefficients solved for via Eq. (16). This feature will be used to 
determine which combinations of the five terms are most effective in maximizing 
the correlation between the modeled and exact subgrid-scale stresses. 

The quality of the fit is measured in terms of the tensorial correlation coefficient 


< ( T ex > 

\J < ( r ex)u ( T ex )*> MijMij > 

where Mij = c*(m*)ij is the composite model tensor. 

The procedure developed above was applied to the homogeneous shear flow data 
base. Correlation coefficients were determined for all possible combinations of one 
to five model components. Figure 4 shows the results of this study where the high- 
est correlation coefficient obtained for a given number of model tensors is plotted 
against the number of tensors used. 

The correlation increases as more model tensors are included. The increment 
in improved correlation, however, deceases as more terms are added. In fact, the 
correlation coefficient when just three terms are used is nearly identical to that 
when all five terms are included. This fact suggests that at least two of the terms in 
Eq. (12) axe not particularly useful. The relative importance of the various terms 
are summarized in Table 2, where the optimal combinations of terms that give rise 
to the correlations in Figure 4 are listed. 

Note that when only one term is used, the optimal choice is not the Smagorinsky 
model (term 1), but rather term 4, r 2 (Sjjtfijt; — SlikSkj)- For reference, the cor- 
relation produced by the Smagorinsky model alone is shown as the square symbol 
in Figure 4. The Smagorinsky model is seen to be only slightly inferior to term 4. 
When two or more terms are included, the Smagorinsky model is always present. 
Terms 2 and 5 enter the list in the last two positions and do not significantly im- 
prove the correlation. It is interesting to note that both of these terms contain the 
mean shear. It is also interesting that terms 3 and 4 are proportional to the mean 
rotation, and it is these terms that are most effective in increasing the correlation. 
This point will be discussed further in the following section. 
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FIGURE 4. Correlation coefficient between the exact homogeneous shear flow sgs 
stress and subsets of the terms in the model of Eq. (12). For a given number of 
model tensors, the correlation coefficient plotted is the highest one obtained when 
all possible combinations of the five terms was considered. 


Number of terms 

Best combination 

i 

4 

2 

1,4 

3 

1,3,4 

4 

1, 2, 3, 4 

5 

1, 2, 3, 4, 5 


TABLE 2. Optimal subsets of the model terms in Eq. (12) applied to homoge- 
neous shear flow. 


When terms 1, 3, and 4 are used, there is slightly more than a 50% improvement 
over the Smagorinsky model. This compares with roughly 100% improvement for 
the tensorially incorrect model listed in Table 1. This discrepancy is due to the fact 
that not all of the terms contained in Table 1 can be reproduced by the model of 
Eq. (12). Nevertheless, a simple tensorially correct model was found that captures 
some of the trends found by projection pursuit. 

The coefficients of terms 1, 3, and 4 are 8.52 x 10“ 3 , —3.03 x 10~ 2 , and 4.16 x 10“ 2 
respectively. 
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5. Channel flow 

In this section we consider DNS of channel flow. The data were generated with a 
pseudo-spectral code as detailed in Kim et al (1987). The Reynolds number based 
on the wall friction velocity was 395, and 256 x 193 x 192 grid points were used. 
The data was cutoff filtered in the streamwise and spanwise directions only, with 
a filter size of four grid cells. All results presented here were generated on a single 
plane of data at ^ - 0.126 ( y + = 49.8 ) where h is the channel half width. 

5.1 Results 

As a starting point, the correlation between the exact stresses and the Smagorin- 
sky model was investigated. Individual correlation coefficients were computed for 
each of the tensor elements, and these were found to be very low ( p 0.07 on 
average). This trend is shown in Figure 5 in the form of a scatter plot of r 12 vs 
—5'12- 



Figure 5. Scatter plot of sgs stress r\ 2 as a function of rate-of-strain element S\ 2 
in DNS of channel flow, at y/h = 0.126. 

No causal dependence appears to exist between the two variables. The ppreg 
algorithm was then applied to the data using all elements of Sij, £i)k and the invari- 
ants as 11 predictor variables. A sequence of several projections was found for each 
element of ry. This list of projections was reduced to a single one for each element 
of r u by retaining the one that reduced the variance most strongly in each case. 
These optimal projections axe listed in Table 3. 

As opposed to the results for homogeneous shear flow, the functional form between 
Tij and these projections was found to be non-linear. As an illustration, a scatter 
plot between r 12 and the projection z \ 2 = 0.77523 + 0.395i3 — 0.3uq is shown in 
Figure 6. The main trend of Ti 2 as a function of z \ 2 appears to be quadratic. 
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Stress 

z, (best projection) 

^( r »> » — II Sij ] 


rn 

—0.85x3 + 0.51 uj 2 

0.12 

0.44 

T 22 

-0.4S 22 - 0.8627s 

0.10 

0.21 

^33 

0.45523 + 0.855x3 + 0.17u)2 

0.02 

0.36 

Tl2 

0.775 2 3 + 0.395x3 “ 0.3cDx 

0.10 

0.34 

?23 

— 0.5l5n — 0.435 2 2 “1" 0.355x2+ 

0.475 2 3 — 0.295x3 ~ 0.17(t5x — u) 2 + ^ 3 ) 

0.05 

0.27 

rn 

0.825n + 0.485 22 0.23u)2 

0.05 

0.31 


TABLE 3. Results of projection pursuit for channel flow. 



FIGURE 6. Scatter plot of sgs stress tj 2 as a function of best projection onto 
elements of filtered velocity gradient tensor, found by ppreg. Same data as in 
Figure 5. 

Similar behavior was found for all other tensor elements, the trend being strongest 
for the [11], [33], [12], and [13] components. To quantify the causality between the 
stresses and the corresponding z* s, the correlation coefficients between individual 
elements of r and the corresponding z 2 were computed. Since each of the observed 
quadratic trends had a minimum close to the origin, it is sufficient to consider the 
single term z 2 . These correlation coefficients appear in the last column of Table 
3. Notice that when compared with the correlation produced by the Smagorinsky 
model, more than a four-fold increase is detected. This trend can also be observed 
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by comparing Figure 5 to Figure 6. 

5.2 Modeling 

As in section 4.2, the expressions in Table 3 are by themselves not valid relations 
between tensors. Unlike the linear relationship found in shear flow, the elements of 
r depend quadrat ically on the projections in channel flow. For example, the model 
for t \2 is (0.77523 + 0.395 i 3 - 0.3aq) 2 . The tensor model found for homogeneous 
shear flow may not be of much use in this case since it is not able to produce the 
non-linear products that result from squaring the projection. The quadratic non- 
linearities suggest that it may be possible to model the stresses in terms of various 
tensor products of the strain and rotation rates. Such a model is 


r*j = - 2c\r 2 H§S,j + 

c 2 r 2 (S ik S kj y + c 3 r 2 (R lk R kj y + (18) 

c 4 r 2 (S, k R kj - Ri k S k j) + c s r 2 (S ik S k ,R,j - R ik S k ,S,j)/II s , 

where ()* again indicates trace- free part. This model was studied by Lund and 
Novikov (1992) and represents the most general relation between the subgrid-scale 
stress and the strain and rotation rate tensors. 

The least-squares fitting procedure was applied to the above model as well as the 
model of Eq. (12). The resulting correlation coefficients are shown in Figure 7. 



FIGURE 7. Correlation coefficients for the terms in the models of Eqs (12) and 
(18) applied to channel flow data. 

As expected, the model developed for shear flow (Eq. (12)) does not offer much 
improvement here. The non-linear model of Eq (18), on the other hand, considerably 
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Number of terms 

Best combination 

i 

4 

2 

3,4 

3 

1, 3,4 

4 

1, 3, 4, 5 

5 

1, 2, 3, 4, 5 


TABLE 4. Optimal subsets of the model terms in Eq. (18) applied to channel 
flow. 

increases the correlation relative to the Smagorinsky model. As in the shear flow 
case, only two or three terms contribute significantly to the increase in correlation 
coefficient. The ranking of terms in Eq. (18) is summarized in Table 4. 

As in the shear flow, the Smagorinsky model does not produce the highest corre- 
lation when used in isolation. In fact, it produces the lowest correlation of any single 
term, while term 4, the best one, produces a correlation coefficient that is roughly 
2.7 times higher! Furthermore, the Smagorinsky model does not enter the list until 
three or more terms are included and adds little to the correlation at that point. 
When three terms are included, the correlation coefficient is about 3.2 times higher 
than that provided by the Smagorinsky model. This compares with an average of 
improvement of a factor of 4.4 obtained by the projection pursuit algorithm. Thus 
the model of Eq. (18) incorporates quite well the findings of projection pursuit into 
a tensorially correct model. 

As in the shear flow, the rotation rate enters as an important parameter. In both 
the shear and channel flow, the best single term is the product of strain and rotation 
rate (actually, it is the mean rotation in the shear flow and the local rotation in 
the channel flow). The observed strong dependence on this term is perhaps not 
too surprising since it is representative of vortex stretching. Although there is a 
connection with vortex stretching, the strain, rotation product does not remove 
energy from the large scales ( i.e. (SikRkj - Rik§kj)Sij = 0). Thus, by itself, this 
term would not be a useful model, and a term that has a non-zero projection on 
the strain rate (such as the Smagorinsky model) must be added. 

The coefficients of terms 1, 3, and 4 are 1.13 x 10~ 3 , — 1.38xl0“ 2 , and -8.71 x 10“ 3 
respectively. 

The above collection of predictor variables is by no means exhaustive. Examples 
of other dependencies that could have been included are the mean velocity gradients 
E* ; and and the distance from the wall A* (a vector). 

6, Summary and conclusions 

A novel regression algorithm has been used to explore DNS data in an effort 
to determine improved models that parameterize the sgs stresses for large eddy 



Sgs parameterization by projection pursuit 


79 


simulation. In addition to the rate of strain, several other variables have been 
considered. These include rotation, velocity gradients filtered at larger scales, and 
velocity gradients at neighboring points as well as the invariants of the strain and 
rotation rate tensors. DNS data from isotropic turbulence, homogeneous shear flow, 
and turbulent channel flow have been considered. 

For isotropic turbulence, no statistically robust relations were found other than 
the small correlation between the stress and rate-of- strain tensor required for energy 
transfer. This finding may imply that, other than the weak relation between the 
stress and rate of strain, the large-scale velocity gradients in isotropic flow do not 
dictate the behavior of the small scales giving rise to the sgs stresses. Given that 
the ppreg algorithm is not guaranteed to find all existing trends, we can not state 
this conclusion with absolute certainty. Nevertheless, it is very likely that for the 
Reynolds number range considered, there is no strong, simple connection between 
large scale velocity gradients and sgs stresses in isotropic turbulence. 

Entirely different behavior was observed in turbulent shear flows. Individual 
components of the stress tensor were found to depend on several elements of the 
fluctuating strain and rotation rate tensors. The dependence was found to be linear 
in the case of homogeneous shear flow and quadratic in the case of channel flow. 
In the case of homogeneous shear flow, the observed dependence was used to guide 
the construction of a model that involved tensor products of the mean strain and 
rotation rate with their fluctuating counterparts. This model was shown to produce 
a correlation between modeled and exact stresses that was 50% higher than that 
given by the Smagorinsky model. The proposed model for homogeneous shear 
flow did not carry over to channel flow, and only marginal improvement over the 
Smagorinsky model was observed. The results of projection pursuit were again used 
to guide the construction of a model for channel flow. This model was considerably 
more successful, yielding more than a 200% improvement over the Smagorinsky 
model. Whereas the shear flow model did not extend well to channel flow, the 
channel flow model did perform reasonably well in shear flow, yielding correlations 
that were roughly 90% of those achieved with the shear flow model. 

One interesting finding of this work is that mean strain and rotation rates enter 
in the parameterization of the subgrid-scale stresses, at least in the case of homoge- 
neous shear flow. This is at variance with the view that at large Reynolds numbers 
the small scales should be nearly isotropic and unaligned with the large-scale mo- 
tions (Kolmogorov, 1941). Indeed, recent experimental measurements of Saddoughi 
(1992) confirm small-scale isotropy at high i?e. Of course, the low Reynolds num- 
ber data used here does not provide a sufficient range of scales to realize small 
scale isotropy, and, consequently, the subgrid scales have some residual alignment 
with the mean gradients. It is thus conceivable that the observed dependence on 
the mean quantities would disappear if the Reynolds number and hence the scale 
separation were increased. 

On the other hand, it is not clear that traditional measures of isotropy (spectra, 
structure functions etc.) have a direct connection with the behavior of the sgs 
stresses. Alternately, the observed dependence on the mean quantities could also 
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be present in a slightly different form at higher Reynolds number. In this view, the 
shear and rotation produced by large scales of size, say, br (where r is the filter size 
and b > 2) may take on the role of mean shear and rotation as far as the small 
scales are concerned. We did not find such trends in the isotropic flow using 6 = 2, 
but it is possible that such a trend requires large separation (b » 2) and higher 
Re. Unfortunately, this issue cannot be addressed using DNS data at low Reynolds 
numbers. 
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